OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Coarse classification of Chinese characters via stroke clustering method

Internal identifier: 002A01 (Main/Exploration); previous: 002A00; next: 002A02

Authors: Chin-Chuan Han [People's Republic of China]; Yao-Lung Tseng [People's Republic of China]; Kuo-Chin Fan [People's Republic of China, Taiwan]; An-Bang Wang [People's Republic of China]

RBID : ISTEX:1810EFB0B5D295E915FC26E7B2C76832A476B30E

Abstract

In this paper, we propose a stroke clustering-based coarse classification mechanism to classify the multi-fonts Chinese characters. The main purpose of the proposed method is to identify the associating type of an input character together with the extraction of its embedded composing components. In this paper, the K-mean clustering algorithm is employed to cluster the thinned strokes. Besides, mis-clustered stroke modification techniques are developed to rearrange the mis-clustered strokes generated by the K-mean algorithm. Five kinds of fonts for 2500 frequently used Chinese characters are tested in our experiments. The average classification rate is 92.57% which is very promising for coarse classification.
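The clustering step named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each thinned stroke is summarised by the midpoint of its segment, that a left-right character is split into two groups (k = 2), and that the caller supplies the initial centroids. The stroke coordinates and the names `kmeans` and `stroke_midpoints` are hypothetical, and the mis-clustered stroke modification step mentioned in the abstract is not modelled.

```python
def kmeans(points, init_centroids, iterations=50):
    """Plain k-means on 2-D points, starting from caller-supplied centroids.

    Returns (centroids, labels), where labels[i] is the index of the
    cluster that points[i] was assigned to.
    """
    centroids = list(init_centroids)
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Hypothetical stroke midpoints (x, y) for a character with a left and
# a right component; the values are invented for illustration only.
stroke_midpoints = [(1, 2), (2, 5), (2, 8), (8, 1), (9, 5), (8, 9)]

# Assumed initialisation: one stroke from each side of the character.
centroids, labels = kmeans(stroke_midpoints, [(1, 2), (8, 1)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

K-means can settle in a poor local optimum from an unlucky start, which is one plausible reason a correction pass for mis-clustered strokes is needed after clustering.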

DOI: 10.1016/0167-8655(95)00054-K


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Coarse classification of Chinese characters via stroke clustering method</title>
<author>
<name sortKey="Han, Chin Chuan" sort="Han, Chin Chuan" uniqKey="Han C" first="Chin-Chuan" last="Han">Chin-Chuan Han</name>
</author>
<author>
<name sortKey="Tseng, Yao Lung" sort="Tseng, Yao Lung" uniqKey="Tseng Y" first="Yao-Lung" last="Tseng">Yao-Lung Tseng</name>
</author>
<author>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
</author>
<author>
<name sortKey="Wang, An Bang" sort="Wang, An Bang" uniqKey="Wang A" first="An-Bang" last="Wang">An-Bang Wang</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:1810EFB0B5D295E915FC26E7B2C76832A476B30E</idno>
<date when="1995" year="1995">1995</date>
<idno type="doi">10.1016/0167-8655(95)00054-K</idno>
<idno type="url">https://api.istex.fr/document/1810EFB0B5D295E915FC26E7B2C76832A476B30E/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000474</idno>
<idno type="wicri:Area/Istex/Curation">000467</idno>
<idno type="wicri:Area/Istex/Checkpoint">001D58</idno>
<idno type="wicri:doubleKey">0167-8655:1995:Han C:coarse:classification:of</idno>
<idno type="wicri:Area/Main/Merge">002B58</idno>
<idno type="wicri:Area/Main/Curation">002A01</idno>
<idno type="wicri:Area/Main/Exploration">002A01</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">Coarse classification of Chinese characters via stroke clustering method</title>
<author>
<name sortKey="Han, Chin Chuan" sort="Han, Chin Chuan" uniqKey="Han C" first="Chin-Chuan" last="Han">Chin-Chuan Han</name>
<affiliation wicri:level="1">
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Information Engineering, National Central University, Chung-Li 32054, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Tseng, Yao Lung" sort="Tseng, Yao Lung" uniqKey="Tseng Y" first="Yao-Lung" last="Tseng">Yao-Lung Tseng</name>
<affiliation wicri:level="1">
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Information Engineering, National Central University, Chung-Li 32054, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
<affiliation wicri:level="1">
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Information Engineering, National Central University, Chung-Li 32054, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Taïwan</country>
</affiliation>
</author>
<author>
<name sortKey="Wang, An Bang" sort="Wang, An Bang" uniqKey="Wang A" first="An-Bang" last="Wang">An-Bang Wang</name>
<affiliation wicri:level="1">
<country xml:lang="fr">République populaire de Chine</country>
<wicri:regionArea>Institute of Computer Science and Information Engineering, National Central University, Chung-Li 32054, Taiwan</wicri:regionArea>
<wicri:noRegion>Taiwan</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Pattern Recognition Letters</title>
<title level="j" type="abbrev">PATREC</title>
<idno type="ISSN">0167-8655</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1995">1995</date>
<biblScope unit="volume">16</biblScope>
<biblScope unit="issue">10</biblScope>
<biblScope unit="page" from="1079">1079</biblScope>
<biblScope unit="page" to="1089">1089</biblScope>
</imprint>
<idno type="ISSN">0167-8655</idno>
</series>
<idno type="istex">1810EFB0B5D295E915FC26E7B2C76832A476B30E</idno>
<idno type="DOI">10.1016/0167-8655(95)00054-K</idno>
<idno type="PII">0167-8655(95)00054-K</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0167-8655</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this paper, we propose a stroke clustering-based coarse classification mechanism to classify the multi-fonts Chinese characters. The main purpose of the proposed method is to identify the associating type of an input character together with the extraction of its embedded composing components. In this paper, the K-mean clustering algorithm is employed to cluster the thinned strokes. Besides, mis-clustered stroke modification techniques are developed to rearrange the mis-clustered strokes generated by the K-mean algorithm. Five kinds of fonts for 2500 frequently used Chinese characters are tested in our experiments. The average classification rate is 92.57% which is very promising for coarse classification.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>République populaire de Chine</li>
<li>Taïwan</li>
</country>
</list>
<tree>
<country name="République populaire de Chine">
<noRegion>
<name sortKey="Han, Chin Chuan" sort="Han, Chin Chuan" uniqKey="Han C" first="Chin-Chuan" last="Han">Chin-Chuan Han</name>
</noRegion>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
<name sortKey="Tseng, Yao Lung" sort="Tseng, Yao Lung" uniqKey="Tseng Y" first="Yao-Lung" last="Tseng">Yao-Lung Tseng</name>
<name sortKey="Wang, An Bang" sort="Wang, An Bang" uniqKey="Wang A" first="An-Bang" last="Wang">An-Bang Wang</name>
</country>
<country name="Taïwan">
<noRegion>
<name sortKey="Fan, Kuo Chin" sort="Fan, Kuo Chin" uniqKey="Fan K" first="Kuo-Chin" last="Fan">Kuo-Chin Fan</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
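The bibliographic fields of the record above can be read with a standard XML parser. The following is a minimal sketch, assuming Python's `xml.etree.ElementTree` and only the `<series>` fragment copied from the record. The full record uses the `wicri:` prefix without a namespace declaration in the text shown here, so a namespace-aware parser would reject it unless such a declaration were supplied. The variable names and the citation layout are illustrative choices, not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# The <series> element, copied verbatim from the record above.
SERIES_XML = """<series>
<title level="j">Pattern Recognition Letters</title>
<title level="j" type="abbrev">PATREC</title>
<idno type="ISSN">0167-8655</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1995">1995</date>
<biblScope unit="volume">16</biblScope>
<biblScope unit="issue">10</biblScope>
<biblScope unit="page" from="1079">1079</biblScope>
<biblScope unit="page" to="1089">1089</biblScope>
</imprint>
</series>"""

series = ET.fromstring(SERIES_XML)
imprint = series.find("imprint")

# The full journal title is the <title> without a type attribute.
journal = next(t.text for t in series.findall("title") if t.get("type") is None)
volume = imprint.find("biblScope[@unit='volume']").text
issue = imprint.find("biblScope[@unit='issue']").text
# The page range is split over two elements, one with "from", one with "to".
first_page = imprint.find("biblScope[@from]").get("from")
last_page = imprint.find("biblScope[@to]").get("to")
year = imprint.find("date").get("when")

citation = f"{journal} {volume}({issue}): {first_page}-{last_page} ({year})"
print(citation)  # Pattern Recognition Letters 16(10): 1079-1089 (1995)
```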

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002A01 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002A01 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:1810EFB0B5D295E915FC26E7B2C76832A476B30E
   |texte=   Coarse classification of Chinese characters via stroke clustering method
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024